Predict Bike Sharing Demand with AutoGluon Template¶
Project: Predict Bike Sharing Demand with AutoGluon¶
This notebook is a template with each step that you need to complete for the project.
Please fill in your code where there are explicit ? markers in the notebook. You are welcome to add more cells and code as you see fit.
Once you have completed all the code implementations, please export your notebook as an HTML file so the reviewers can view your code. Make sure all cell outputs are present and correct.
File-> Export Notebook As... -> Export Notebook as HTML
There is a writeup to complete as well after all code implementation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either markdown or PDF.
Completing the code template and writeup template will cover all of the rubric points for this project.
The rubric contains optional "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue them, you can include the code in this notebook and also discuss the results in the writeup file.
Step 1: Create an account with Kaggle¶
Create Kaggle Account and download API key¶
Below is an example of the steps to get the API username and key. Each student will have their own username and key.
- Open account settings.
- Scroll down to API and click Create New API Token.
- Open up kaggle.json and use the username and key.
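The steps above can be sketched in code: a minimal helper that reads the downloaded kaggle.json and exports the credentials as the KAGGLE_USERNAME and KAGGLE_KEY environment variables, which the kaggle library picks up automatically. The ~/.kaggle/kaggle.json path is an assumption; adjust it to wherever you saved the file.

```python
import json
import os
from pathlib import Path

# Assumed location; change this if you saved kaggle.json elsewhere.
kaggle_json = Path.home() / ".kaggle" / "kaggle.json"

def load_kaggle_credentials(path):
    """Read kaggle.json and export the username/key as environment
    variables so the kaggle library can authenticate."""
    creds = json.loads(Path(path).read_text())
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]

# Run after placing kaggle.json at the path above:
# load_kaggle_credentials(kaggle_json)
```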
Step 2: Download the Kaggle dataset using the kaggle python library¶
Open up SageMaker Studio and use the starter template¶
- Notebook should be using an ml.t3.medium instance (2 vCPU + 4 GiB)
- Notebook should be using the kernel: Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)
Install packages¶
!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
# Without --no-cache-dir, smaller aws instances may have trouble installing
Requirement already satisfied: pip in /opt/conda/lib/python3.10/site-packages (23.3.2)
Successfully installed pip-24.0
Successfully installed bokeh-2.0.1 graphviz-0.8.4 mxnet-1.9.1
Requirement already satisfied: autogluon in /opt/conda/lib/python3.10/site-packages (0.8.2)
Collecting opencensus-context>=0.1.3 (from opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
Downloading opencensus_context-0.1.3-py2.py3-none-any.whl.metadata (3.3 kB)
Collecting google-api-core<3.0.0,>=1.0.0 (from opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
Downloading google_api_core-2.18.0-py3-none-any.whl.metadata (2.7 kB)
Requirement already satisfied: tenacity>=6.2.0 in /opt/conda/lib/python3.10/site-packages (from plotly->catboost<1.3,>=1.1->autogluon.tabular[all]==0.8.2->autogluon) (8.2.3)
Requirement already satisfied: markdown-it-py>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/conda/lib/python3.10/site-packages (from rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (2.17.2)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/lib/python3.10/site-packages (from sympy->torch<2.1,>=1.13->autogluon.multimodal==0.8.2->autogluon) (1.3.0)
Requirement already satisfied: wcwidth>=0.1.4 in /opt/conda/lib/python3.10/site-packages (from blessed>=1.17.1->gpustat>=1.0.0->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon) (0.2.13)
Collecting googleapis-common-protos<2.0.dev0,>=1.56.2 (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
Downloading googleapis_common_protos-1.63.0-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting proto-plus<2.0.0dev,>=1.22.3 (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<2.7,>=2.6.3; extra == "all"->autogluon.core[all]==0.8.2->autogluon)
Downloading proto_plus-1.23.0-py3-none-any.whl.metadata (2.2 kB)
Requirement already satisfied: mdurl~=0.1 in /opt/conda/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->openmim<0.4.0,>=0.3.7->autogluon.multimodal==0.8.2->autogluon) (0.1.2)
Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /opt/conda/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (0.5.1)
Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.10/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<1.1,>=0.5->tensorboard<3,>=2.9->autogluon.multimodal==0.8.2->autogluon) (3.2.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /opt/conda/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.7.10)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /opt/conda/lib/python3.10/site-packages (from thinc<8.3.0,>=8.2.2->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.1.4)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /opt/conda/lib/python3.10/site-packages (from weasel<0.4.0,>=0.1.0->spacy<4->fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.8.2->autogluon) (0.16.0)
Requirement already satisfied: soupsieve>1.2 in /opt/conda/lib/python3.10/site-packages (from beautifulsoup4->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (2.5)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /opt/conda/lib/python3.10/site-packages (from requests[socks]->gdown>=4.0.0->nlpaug<1.2.0,>=1.1.10->autogluon.multimodal==0.8.2->autogluon) (1.7.1)
Downloading hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 208.7 MB/s eta 0:00:00
Downloading ray-2.6.3-cp310-cp310-manylinux2014_x86_64.whl (56.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.9/56.9 MB 134.9 MB/s eta 0:00:00
Downloading py_spy-0.3.14-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (3.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 265.5 MB/s eta 0:00:00
Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.7/101.7 kB 275.0 MB/s eta 0:00:00
Downloading virtualenv-20.21.0-py3-none-any.whl (8.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.7/8.7 MB 242.7 MB/s eta 0:00:00
Downloading aiohttp_cors-0.7.0-py3-none-any.whl (27 kB)
Downloading colorful-0.5.6-py2.py3-none-any.whl (201 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 201.4/201.4 kB 366.4 MB/s eta 0:00:00
Downloading opencensus-0.11.4-py2.py3-none-any.whl (128 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 128.2/128.2 kB 269.9 MB/s eta 0:00:00
Downloading py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.5/200.5 kB 377.0 MB/s eta 0:00:00
Downloading blessed-1.20.0-py2.py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.4/58.4 kB 247.0 MB/s eta 0:00:00
Downloading distlib-0.3.8-py2.py3-none-any.whl (468 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.9/468.9 kB 343.2 MB/s eta 0:00:00
Downloading google_api_core-2.18.0-py3-none-any.whl (138 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 138.3/138.3 kB 290.7 MB/s eta 0:00:00
Downloading nvidia_ml_py-12.550.52-py3-none-any.whl (39 kB)
Downloading opencensus_context-0.1.3-py2.py3-none-any.whl (5.1 kB)
Downloading platformdirs-3.11.0-py3-none-any.whl (17 kB)
Downloading googleapis_common_protos-1.63.0-py2.py3-none-any.whl (229 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.1/229.1 kB 305.2 MB/s eta 0:00:00
Downloading proto_plus-1.23.0-py3-none-any.whl (48 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.8/48.8 kB 265.0 MB/s eta 0:00:00
Building wheels for collected packages: gpustat
Building wheel for gpustat (pyproject.toml) ... done
Created wheel for gpustat: filename=gpustat-1.1.1-py3-none-any.whl size=26532 sha256=aa4b8348de44c46abea7798d3ae46f8adb523c049aa19094e762fbcfe14ab2fc
Stored in directory: /tmp/pip-ephem-wheel-cache-sgjyrp8c/wheels/ec/d7/80/a71ba3540900e1f276bcae685efd8e590c810d2108b95f1e47
Successfully built gpustat
Installing collected packages: py4j, py-spy, opencensus-context, nvidia-ml-py, distlib, colorful, tensorboardX, proto-plus, platformdirs, googleapis-common-protos, blessed, virtualenv, ray, hyperopt, gpustat, google-api-core, aiohttp-cors, opencensus
Attempting uninstall: platformdirs
Found existing installation: platformdirs 4.2.0
Uninstalling platformdirs-4.2.0:
Successfully uninstalled platformdirs-4.2.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sparkmagic 0.21.0 requires pandas<2.0.0,>=0.17.1, but you have pandas 2.1.4 which is incompatible.
Successfully installed aiohttp-cors-0.7.0 blessed-1.20.0 colorful-0.5.6 distlib-0.3.8 google-api-core-2.18.0 googleapis-common-protos-1.63.0 gpustat-1.1.1 hyperopt-0.2.7 nvidia-ml-py-12.550.52 opencensus-0.11.4 opencensus-context-0.1.3 platformdirs-3.11.0 proto-plus-1.23.0 py-spy-0.3.14 py4j-0.10.9.7 ray-2.6.3 tensorboardX-2.6.2.2 virtualenv-20.21.0
Setup Kaggle API Key¶
# create the .kaggle directory and an empty kaggle.json file
!mkdir -p ~/.kaggle
!touch ~/.kaggle/kaggle.json
!chmod 600 ~/.kaggle/kaggle.json
import os
import json
# Get the user's home directory
home_dir = os.path.expanduser("~")
# Fill in your user name and key from creating the Kaggle account and API token file
kaggle_username = "zafarabdugaffarov"
kaggle_key = "****************"  # your Kaggle API key; redacted here, never publish a real key
# Create the .kaggle directory if it doesn't exist
kaggle_dir = os.path.join(home_dir, ".kaggle")
if not os.path.exists(kaggle_dir):
os.makedirs(kaggle_dir)
# Save API token to the kaggle.json file
kaggle_json_path = os.path.join(kaggle_dir, "kaggle.json")
with open(kaggle_json_path, "w") as f:
json.dump({"username": kaggle_username, "key": kaggle_key}, f)
# Set appropriate permissions
os.chmod(kaggle_json_path, 0o600)
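After writing the credentials, it can be worth a quick sanity check that the file exists with owner-only permissions, since the Kaggle CLI rejects key files readable by other users. Below is a minimal self-contained sketch of that check; it writes dummy credentials to a temporary directory rather than the real `~/.kaggle` path, which you would use in the notebook.

```python
import json
import os
import stat
import tempfile

# Self-contained sketch: a temp dir stands in for the real location.
# In the notebook, point at os.path.expanduser("~/.kaggle/kaggle.json") instead.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "kaggle.json")
with open(path, "w") as f:
    json.dump({"username": "user", "key": "key"}, f)
os.chmod(path, 0o600)

# Verify owner-only (0o600) permissions and that the JSON has both fields.
mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # 0o600
with open(path) as f:
    creds = json.load(f)
print(sorted(creds))  # ['key', 'username']
```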
Download and explore dataset¶
Go to the bike sharing demand competition and agree to the terms¶
# Download the dataset; it will be in a .zip file, so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you've already downloaded it, the -o flag tells unzip to overwrite existing files
!unzip -o bike-sharing-demand.zip
/bin/bash: line 1: kaggle: command not found
unzip: cannot find or open bike-sharing-demand.zip, bike-sharing-demand.zip.zip or bike-sharing-demand.zip.ZIP.
import pandas as pd
from autogluon.tabular import TabularPredictor
# Create the train dataset in pandas by reading the csv
# Set the parsing of the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv('train.csv')
train.head()
| | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 |
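The `dt` accessor used later only works once the column is parsed as datetimes, for example via `parse_dates` in `read_csv`. A small self-contained illustration, using a tiny inline CSV with two hypothetical sample rows in place of `train.csv`:

```python
import io
import pandas as pd

# Tiny inline CSV standing in for train.csv (hypothetical sample rows).
csv = "datetime,count\n2011-01-01 00:00:00,16\n2011-01-01 01:00:00,40\n"
df = pd.read_csv(io.StringIO(csv), parse_dates=["datetime"])

print(df["datetime"].dtype)             # datetime64[ns]
print(df["datetime"].dt.hour.tolist())  # [0, 1]
```

Without `parse_dates`, the column would load as plain strings and `df["datetime"].dt` would raise an error.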
# Simple output of the train dataset to view the min/max/variation of the dataset features.
train.describe()
| | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.00000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 |
| mean | 2.506614 | 0.028569 | 0.680875 | 1.418427 | 20.23086 | 23.655084 | 61.886460 | 12.799395 | 36.021955 | 155.552177 | 191.574132 |
| std | 1.116174 | 0.166599 | 0.466159 | 0.633839 | 7.79159 | 8.474601 | 19.245033 | 8.164537 | 49.960477 | 151.039033 | 181.144454 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.82000 | 0.760000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.94000 | 16.665000 | 47.000000 | 7.001500 | 4.000000 | 36.000000 | 42.000000 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 20.50000 | 24.240000 | 62.000000 | 12.998000 | 17.000000 | 118.000000 | 145.000000 |
| 75% | 4.000000 | 0.000000 | 1.000000 | 2.000000 | 26.24000 | 31.060000 | 77.000000 | 16.997900 | 49.000000 | 222.000000 | 284.000000 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 41.00000 | 45.455000 | 100.000000 | 56.996900 | 367.000000 | 886.000000 | 977.000000 |
# Create the test dataframe in pandas by reading the csv, remembering to parse the datetime!
test = pd.read_csv('test.csv', parse_dates=['datetime'])
test.head()
| | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-20 00:00:00 | 1 | 0 | 1 | 1 | 10.66 | 11.365 | 56 | 26.0027 |
| 1 | 2011-01-20 01:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 2 | 2011-01-20 02:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 3 | 2011-01-20 03:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
| 4 | 2011-01-20 04:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
test.describe()
| | season | holiday | workingday | weather | temp | atemp | humidity | windspeed |
|---|---|---|---|---|---|---|---|---|
| count | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 |
| mean | 2.493300 | 0.029108 | 0.685815 | 1.436778 | 20.620607 | 24.012865 | 64.125212 | 12.631157 |
| std | 1.091258 | 0.168123 | 0.464226 | 0.648390 | 8.059583 | 8.782741 | 19.293391 | 8.250151 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.820000 | 0.000000 | 16.000000 | 0.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.940000 | 16.665000 | 49.000000 | 7.001500 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 21.320000 | 25.000000 | 65.000000 | 11.001400 |
| 75% | 3.000000 | 0.000000 | 1.000000 | 2.000000 | 27.060000 | 31.060000 | 81.000000 | 16.997900 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 40.180000 | 50.000000 | 100.000000 | 55.998600 |
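Comparing the two summaries above shows the test set lacks the `casual`, `registered`, and `count` columns. A set difference makes such mismatches easy to spot; the column names below are hard-coded from the tables above:

```python
train_cols = {"datetime", "season", "holiday", "workingday", "weather", "temp",
              "atemp", "humidity", "windspeed", "casual", "registered", "count"}
test_cols = {"datetime", "season", "holiday", "workingday", "weather", "temp",
             "atemp", "humidity", "windspeed"}

# Columns present in train but absent from test: these can't be used as features.
print(sorted(train_cols - test_cols))  # ['casual', 'count', 'registered']
```

This is exactly why `casual` and `registered` are dropped before training in Step 3.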
# Read the sample submission file the same way as the train and test datasets
submission = pd.read_csv('sampleSubmission.csv')
submission.head()
| | datetime | count |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 0 |
| 1 | 2011-01-20 01:00:00 | 0 |
| 2 | 2011-01-20 02:00:00 | 0 |
| 3 | 2011-01-20 03:00:00 | 0 |
| 4 | 2011-01-20 04:00:00 | 0 |
Step 3: Train a model using AutoGluon’s Tabular Prediction¶
Requirements:
- We are predicting `count`, so it is the label we are setting.
- Ignore the `casual` and `registered` columns, as they are not present in the test dataset.
- Use `root_mean_squared_error` as the evaluation metric.
- Set a time limit of 10 minutes (600 seconds).
- Use the `best_quality` preset to focus on creating the best model.
predictor = TabularPredictor(
label="count", problem_type="regression", eval_metric="rmse"
).fit(
train_data=train.drop(['casual', 'registered'], axis=1),
time_limit=600,
presets='best_quality')
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_105322"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_105322"
AutoGluon Version: 0.8.2
Python Version: 3.10.14
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail: 4.42 GB / 5.36 GB (82.4%)
WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception.
We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows: 10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context("mode.use_inf_as_na", True): # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2343.77 MB
Train Data (Original) Memory Usage: 1.52 MB (0.1% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
('object', ['datetime_as_object']) : 1 | ['datetime']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.1s = Fit runtime
9 features in original data used to generate 13 features in processed data.
Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.77s of the 599.81s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.06s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 396.29s of the 596.32s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 396.11s of the 596.14s of remaining time.
Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
/opt/conda/lib/python3.10/site-packages/dask/dataframe/_pyarrow_compat.py:17: FutureWarning: Minimal version of pyarrow will soon be increased to 14.0.1. You are using 12.0.1. Please consider upgrading.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/dask/dataframe/__init__.py:31: FutureWarning:
Dask dataframe query planning is disabled because dask-expr is not installed.
You can install it with `pip install dask[dataframe]` or `conda install dask`.
This will raise in a future version.
warnings.warn(msg, FutureWarning)
[1000] valid_set's rmse: 131.684 [2000] valid_set's rmse: 130.67 [3000] valid_set's rmse: 130.626 [1000] valid_set's rmse: 135.592 [1000] valid_set's rmse: 133.481 [2000] valid_set's rmse: 132.323 [3000] valid_set's rmse: 131.618 [4000] valid_set's rmse: 131.443 [5000] valid_set's rmse: 131.265 [6000] valid_set's rmse: 131.277 [7000] valid_set's rmse: 131.443 [1000] valid_set's rmse: 128.503 [2000] valid_set's rmse: 127.654 [3000] valid_set's rmse: 127.227 [4000] valid_set's rmse: 127.105 [1000] valid_set's rmse: 134.135 [2000] valid_set's rmse: 132.272 [3000] valid_set's rmse: 131.286 [4000] valid_set's rmse: 130.752 [5000] valid_set's rmse: 130.363 [6000] valid_set's rmse: 130.509 [1000] valid_set's rmse: 136.168 [2000] valid_set's rmse: 135.138 [3000] valid_set's rmse: 135.029 [1000] valid_set's rmse: 134.061 [2000] valid_set's rmse: 133.034 [3000] valid_set's rmse: 132.182 [4000] valid_set's rmse: 131.997 [5000] valid_set's rmse: 131.643 [6000] valid_set's rmse: 131.504 [7000] valid_set's rmse: 131.574 [1000] valid_set's rmse: 132.912 [2000] valid_set's rmse: 131.703 [3000] valid_set's rmse: 131.117 [4000] valid_set's rmse: 130.82 [5000] valid_set's rmse: 130.673 [6000] valid_set's rmse: 130.708
-131.4609 = Validation score (-root_mean_squared_error)
51.98s = Training runtime
8.0s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 328.61s of the 528.64s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000] valid_set's rmse: 130.818 [1000] valid_set's rmse: 133.204 [1000] valid_set's rmse: 130.928 [1000] valid_set's rmse: 126.846 [1000] valid_set's rmse: 131.426 [1000] valid_set's rmse: 133.655 [1000] valid_set's rmse: 132.155 [1000] valid_set's rmse: 130.62
-131.0542 = Validation score (-root_mean_squared_error)
12.82s = Training runtime
1.39s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 312.64s of the 512.68s of remaining time.
-116.5484 = Validation score (-root_mean_squared_error)
19.04s = Training runtime
0.82s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 291.92s of the 491.96s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, early stopping on iteration 4050.
Ran out of time, early stopping on iteration 3990.
Ran out of time, early stopping on iteration 4363.
Ran out of time, early stopping on iteration 4403.
Ran out of time, early stopping on iteration 4811.
-130.5713 = Validation score (-root_mean_squared_error)
240.14s = Training runtime
0.11s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 51.49s of the 251.52s of remaining time.
-124.6007 = Validation score (-root_mean_squared_error)
8.29s = Training runtime
0.51s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 42.04s of the 242.07s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, stopping training early. (Stopping on epoch 6)
Ran out of time, stopping training early. (Stopping on epoch 7)
Ran out of time, stopping training early. (Stopping on epoch 9)
Ran out of time, stopping training early. (Stopping on epoch 11)
Ran out of time, stopping training early. (Stopping on epoch 12)
-140.3195 = Validation score (-root_mean_squared_error)
40.23s = Training runtime
0.54s = Validation runtime
Fitting model: XGBoost_BAG_L1 ... Training model for up to 0.95s of the 200.98s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Time limit exceeded... Skipping XGBoost_BAG_L1.
Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 0.65s of the 200.68s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Time limit exceeded... Skipping NeuralNetTorch_BAG_L1.
Fitting model: LightGBMLarge_BAG_L1 ... Training model for up to 0.44s of the 200.48s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, early stopping on iteration 1. Best iteration is: [1] valid_set's rmse: 179.334
Time limit exceeded... Skipping LightGBMLarge_BAG_L1.
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 199.78s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.73s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 198.98s of the 198.96s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000] valid_set's rmse: 60.9213 [2000] valid_set's rmse: 60.0388 [3000] valid_set's rmse: 59.8521 [1000] valid_set's rmse: 61.2639 [2000] valid_set's rmse: 60.3481 [1000] valid_set's rmse: 64.0419 [2000] valid_set's rmse: 62.8485 [1000] valid_set's rmse: 64.4371 [2000] valid_set's rmse: 62.5034 [3000] valid_set's rmse: 62.3424 [1000] valid_set's rmse: 58.7129 [2000] valid_set's rmse: 57.6587 [1000] valid_set's rmse: 63.5234 [2000] valid_set's rmse: 62.3591 [1000] valid_set's rmse: 62.7864 [2000] valid_set's rmse: 61.7307 [3000] valid_set's rmse: 61.6274 [1000] valid_set's rmse: 57.7822 [2000] valid_set's rmse: 57.105
-60.4701 = Validation score (-root_mean_squared_error)
46.59s = Training runtime
3.89s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 144.17s of the 144.15s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
-55.134 = Validation score (-root_mean_squared_error)
12.76s = Training runtime
0.21s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 130.87s of the 130.85s of remaining time.
-53.4515 = Validation score (-root_mean_squared_error)
42.03s = Training runtime
0.72s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 87.48s of the 87.46s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, early stopping on iteration 1047.
Ran out of time, early stopping on iteration 1188.
Ran out of time, early stopping on iteration 1307.
Ran out of time, early stopping on iteration 1348.
Ran out of time, early stopping on iteration 1445.
-55.4685 = Validation score (-root_mean_squared_error)
78.22s = Training runtime
0.06s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 9.11s of the 9.09s of remaining time.
-53.8593 = Validation score (-root_mean_squared_error)
15.39s = Training runtime
0.85s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -7.67s of remaining time.
-52.8658 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 607.99s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_105322")
Review AutoGluon's training run with ranking of models that did the best.¶
predictor.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -52.865846 13.389465 521.269552 0.000817 0.287105 3 True 15
1 RandomForestMSE_BAG_L2 -53.451478 12.257624 414.610369 0.722348 42.028500 2 True 12
2 ExtraTreesMSE_BAG_L2 -53.859260 12.389017 387.974257 0.853741 15.392388 2 True 14
3 LightGBM_BAG_L2 -55.133954 11.748551 385.337010 0.213275 12.755141 2 True 11
4 CatBoost_BAG_L2 -55.468453 11.599283 450.806419 0.064007 78.224550 2 True 13
5 LightGBMXT_BAG_L2 -60.470103 15.426254 419.170218 3.890978 46.588348 2 True 10
6 KNeighborsDist_BAG_L1 -84.125061 0.104822 0.036808 0.104822 0.036808 1 True 2
7 WeightedEnsemble_L2 -84.125061 0.106286 0.763713 0.001464 0.726904 2 True 9
8 KNeighborsUnif_BAG_L1 -101.546199 0.056908 0.039038 0.056908 0.039038 1 True 1
9 RandomForestMSE_BAG_L1 -116.548359 0.817105 19.042239 0.817105 19.042239 1 True 5
10 ExtraTreesMSE_BAG_L1 -124.600676 0.513394 8.290484 0.513394 8.290484 1 True 7
11 CatBoost_BAG_L1 -130.571286 0.110086 240.143042 0.110086 240.143042 1 True 6
12 LightGBM_BAG_L1 -131.054162 1.394772 12.816886 1.394772 12.816886 1 True 4
13 LightGBMXT_BAG_L1 -131.460909 7.996730 51.984224 7.996730 51.984224 1 True 3
14 NeuralNetFastAI_BAG_L1 -140.319540 0.541459 40.229147 0.541459 40.229147 1 True 8
Number of models trained: 15
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_NNFastAiTabular', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_KNN'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -131.46090891834504,
'LightGBM_BAG_L1': -131.054161598899,
'RandomForestMSE_BAG_L1': -116.54835939455667,
'CatBoost_BAG_L1': -130.57128624162138,
'ExtraTreesMSE_BAG_L1': -124.60067564699747,
'NeuralNetFastAI_BAG_L1': -140.31953985560796,
'WeightedEnsemble_L2': -84.12506123181602,
'LightGBMXT_BAG_L2': -60.47010263935106,
'LightGBM_BAG_L2': -55.133953757189154,
'RandomForestMSE_BAG_L2': -53.45147772878295,
'CatBoost_BAG_L2': -55.46845278345196,
'ExtraTreesMSE_BAG_L2': -53.8592596631687,
'WeightedEnsemble_L3': -52.86584564696691},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': ['KNeighborsUnif_BAG_L1'],
'KNeighborsDist_BAG_L1': ['KNeighborsDist_BAG_L1'],
'LightGBMXT_BAG_L1': ['LightGBMXT_BAG_L1'],
'LightGBM_BAG_L1': ['LightGBM_BAG_L1'],
'RandomForestMSE_BAG_L1': ['RandomForestMSE_BAG_L1'],
'CatBoost_BAG_L1': ['CatBoost_BAG_L1'],
'ExtraTreesMSE_BAG_L1': ['ExtraTreesMSE_BAG_L1'],
'NeuralNetFastAI_BAG_L1': ['NeuralNetFastAI_BAG_L1'],
'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
'LightGBMXT_BAG_L2': ['LightGBMXT_BAG_L2'],
'LightGBM_BAG_L2': ['LightGBM_BAG_L2'],
'RandomForestMSE_BAG_L2': ['RandomForestMSE_BAG_L2'],
'CatBoost_BAG_L2': ['CatBoost_BAG_L2'],
'ExtraTreesMSE_BAG_L2': ['ExtraTreesMSE_BAG_L2'],
'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.03903794288635254,
'KNeighborsDist_BAG_L1': 0.03680849075317383,
'LightGBMXT_BAG_L1': 51.98422408103943,
'LightGBM_BAG_L1': 12.816885948181152,
'RandomForestMSE_BAG_L1': 19.04223918914795,
'CatBoost_BAG_L1': 240.14304184913635,
'ExtraTreesMSE_BAG_L1': 8.290484189987183,
'NeuralNetFastAI_BAG_L1': 40.22914743423462,
'WeightedEnsemble_L2': 0.7269041538238525,
'LightGBMXT_BAG_L2': 46.588348388671875,
'LightGBM_BAG_L2': 12.755140781402588,
'RandomForestMSE_BAG_L2': 42.02849984169006,
'CatBoost_BAG_L2': 78.2245500087738,
'ExtraTreesMSE_BAG_L2': 15.392388105392456,
'WeightedEnsemble_L3': 0.28710460662841797},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.056908369064331055,
'KNeighborsDist_BAG_L1': 0.10482192039489746,
'LightGBMXT_BAG_L1': 7.996730327606201,
'LightGBM_BAG_L1': 1.3947715759277344,
'RandomForestMSE_BAG_L1': 0.8171048164367676,
'CatBoost_BAG_L1': 0.11008644104003906,
'ExtraTreesMSE_BAG_L1': 0.5133936405181885,
'NeuralNetFastAI_BAG_L1': 0.5414590835571289,
'WeightedEnsemble_L2': 0.0014636516571044922,
'LightGBMXT_BAG_L2': 3.8909778594970703,
'LightGBM_BAG_L2': 0.21327519416809082,
'RandomForestMSE_BAG_L2': 0.7223482131958008,
'CatBoost_BAG_L2': 0.06400728225708008,
'ExtraTreesMSE_BAG_L2': 0.853740930557251,
'WeightedEnsemble_L3': 0.0008168220520019531},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -52.865846 13.389465 521.269552
1 RandomForestMSE_BAG_L2 -53.451478 12.257624 414.610369
2 ExtraTreesMSE_BAG_L2 -53.859260 12.389017 387.974257
3 LightGBM_BAG_L2 -55.133954 11.748551 385.337010
4 CatBoost_BAG_L2 -55.468453 11.599283 450.806419
5 LightGBMXT_BAG_L2 -60.470103 15.426254 419.170218
6 KNeighborsDist_BAG_L1 -84.125061 0.104822 0.036808
7 WeightedEnsemble_L2 -84.125061 0.106286 0.763713
8 KNeighborsUnif_BAG_L1 -101.546199 0.056908 0.039038
9 RandomForestMSE_BAG_L1 -116.548359 0.817105 19.042239
10 ExtraTreesMSE_BAG_L1 -124.600676 0.513394 8.290484
11 CatBoost_BAG_L1 -130.571286 0.110086 240.143042
12 LightGBM_BAG_L1 -131.054162 1.394772 12.816886
13 LightGBMXT_BAG_L1 -131.460909 7.996730 51.984224
14 NeuralNetFastAI_BAG_L1 -140.319540 0.541459 40.229147
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000817 0.287105 3 True
1 0.722348 42.028500 2 True
2 0.853741 15.392388 2 True
3 0.213275 12.755141 2 True
4 0.064007 78.224550 2 True
5 3.890978 46.588348 2 True
6 0.104822 0.036808 1 True
7 0.001464 0.726904 2 True
8 0.056908 0.039038 1 True
9 0.817105 19.042239 1 True
10 0.513394 8.290484 1 True
11 0.110086 240.143042 1 True
12 1.394772 12.816886 1 True
13 7.996730 51.984224 1 True
14 0.541459 40.229147 1 True
fit_order
0 15
1 12
2 14
3 11
4 13
5 10
6 2
7 9
8 1
9 5
10 7
11 6
12 4
13 3
14 8 }
Create predictions from test dataset¶
predictions = predictor.predict(test)
# Combine the datetime column with the predicted counts into a single dataframe
predictions = pd.DataFrame({'datetime': test['datetime'], 'Pred_count': predictions})
predictions.head()
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
| | datetime | Pred_count |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 23.776628 |
| 1 | 2011-01-20 01:00:00 | 42.635658 |
| 2 | 2011-01-20 02:00:00 | 46.400299 |
| 3 | 2011-01-20 03:00:00 | 49.024002 |
| 4 | 2011-01-20 04:00:00 | 51.627224 |
NOTE: Kaggle will reject the submission if we don't set everything to be > 0.¶
# Describe the `predictions` dataframe to see if there are any negative values
predictions.describe()
| | Pred_count |
|---|---|
| count | 6493.000000 |
| mean | 101.049049 |
| std | 90.335938 |
| min | 3.017506 |
| 25% | 20.798779 |
| 50% | 63.234932 |
| 75% | 169.659332 |
| max | 362.337433 |
# How many negative values do we have?
(predictions['Pred_count'] < 0).sum()
0
# Set any negative predictions to zero (only the Pred_count column, not the whole row)
predictions.loc[predictions['Pred_count'] < 0, 'Pred_count'] = 0
predictions.describe()
| | Pred_count |
|---|---|
| count | 6493.000000 |
| mean | 101.049049 |
| std | 90.335938 |
| min | 3.017506 |
| 25% | 20.798779 |
| 50% | 63.234932 |
| 75% | 169.659332 |
| max | 362.337433 |
predictions.head()
| | datetime | Pred_count |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 23.776628 |
| 1 | 2011-01-20 01:00:00 | 42.635658 |
| 2 | 2011-01-20 02:00:00 | 46.400299 |
| 3 | 2011-01-20 03:00:00 | 49.024002 |
| 4 | 2011-01-20 04:00:00 | 51.627224 |
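An equivalent, more concise guard against negative predictions is pandas' `clip`, which floors values at a lower bound without touching any other column. A small sketch on made-up values:

```python
import pandas as pd

preds = pd.Series([-3.2, 0.0, 25.7, 104.1])
clipped = preds.clip(lower=0)  # negatives floored at 0, others unchanged
print(clipped.tolist())  # [0.0, 0.0, 25.7, 104.1]
```

In the notebook this would be `predictions['Pred_count'] = predictions['Pred_count'].clip(lower=0)`.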
Set predictions to submission dataframe, save, and submit¶
submission["count"] = predictions['Pred_count']
submission.to_csv("submission.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "first raw submission"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 612kB/s]
Successfully submitted to Bike Sharing Demand
View submission via the command line or in the web browser under the competition's page - My Submissions¶
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName        date                 description           status    publicScore  privateScore
--------------  -------------------  --------------------  --------  -----------  ------------
submission.csv  2024-04-29 11:12:02  first raw submission  complete  1.7998       1.7998
Initial score of 1.7998¶
Step 4: Exploratory Data Analysis and Creating an additional feature¶
- Any additional feature will do, but a great suggestion would be to separate out the datetime into hour, day, or month parts.
# Create a histogram of each feature to show its distribution. This is part of the exploratory data analysis
train.hist()
array([[<Axes: title={'center': 'season'}>,
<Axes: title={'center': 'holiday'}>,
<Axes: title={'center': 'workingday'}>],
[<Axes: title={'center': 'weather'}>,
<Axes: title={'center': 'temp'}>,
<Axes: title={'center': 'atemp'}>],
[<Axes: title={'center': 'humidity'}>,
<Axes: title={'center': 'windspeed'}>,
<Axes: title={'center': 'casual'}>],
[<Axes: title={'center': 'registered'}>,
<Axes: title={'center': 'count'}>, <Axes: >]], dtype=object)
# Create new features from the datetime column
# Assuming train and test are your DataFrames
train['datetime'] = pd.to_datetime(train['datetime'])
test['datetime'] = pd.to_datetime(test['datetime'])
# Now, you can access the components of the datetime column
train['year'] = train['datetime'].dt.year
train['month'] = train['datetime'].dt.month
train['day'] = train['datetime'].dt.day
train['hour'] = train['datetime'].dt.hour
test['year'] = test['datetime'].dt.year
test['month'] = test['datetime'].dt.month
test['day'] = test['datetime'].dt.day
test['hour'] = test['datetime'].dt.hour
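The repeated column assignments above can be condensed into a small helper applied to both frames; a sketch using a hypothetical `add_datetime_parts` function (not part of the project starter code):

```python
import pandas as pd

def add_datetime_parts(df):
    """Split a datetime column into year/month/day/hour feature columns."""
    df['datetime'] = pd.to_datetime(df['datetime'])
    for part in ('year', 'month', 'day', 'hour'):
        df[part] = getattr(df['datetime'].dt, part)
    return df

# Example on a tiny frame; in the project this would be applied to train and test
demo = add_datetime_parts(pd.DataFrame({'datetime': ['2011-01-20 05:00:00']}))
print(demo[['year', 'month', 'day', 'hour']].iloc[0].tolist())  # [2011, 1, 20, 5]
```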
Make category types for these so models know they are not just numbers¶
- AutoGluon originally sees these as ints, but in reality they are int representations of a category.
- Setting the dtype to category will classify these as categories in AutoGluon.
train["season"] = train["season"].astype("category")
train["weather"] = train["weather"].astype("category")
test["season"] = test["season"].astype("category")
test["weather"] = test["weather"].astype("category")
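As a quick sanity check (on a toy Series, not the project data), converting to `category` preserves the values and only changes the dtype AutoGluon sees:

```python
import pandas as pd

season = pd.Series([1, 2, 3, 4, 1], name="season")
season_cat = season.astype("category")

print(season_cat.dtype)                 # category
print(list(season_cat.cat.categories))  # [1, 2, 3, 4]
print(bool((season_cat == season).all()))  # True: values are unchanged
```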
# View our new features
train.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count | year | month | day | hour | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 | 2011 | 1 | 1 | 0 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 | 2011 | 1 | 1 | 1 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 | 2011 | 1 | 1 | 2 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 | 2011 | 1 | 1 | 3 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 | 2011 | 1 | 1 | 4 |
# View histograms of all features again, now including the new datetime features
train.hist()
array([[<Axes: title={'center': 'datetime'}>,
<Axes: title={'center': 'holiday'}>,
<Axes: title={'center': 'workingday'}>,
<Axes: title={'center': 'temp'}>],
[<Axes: title={'center': 'atemp'}>,
<Axes: title={'center': 'humidity'}>,
<Axes: title={'center': 'windspeed'}>,
<Axes: title={'center': 'casual'}>],
[<Axes: title={'center': 'registered'}>,
<Axes: title={'center': 'count'}>,
<Axes: title={'center': 'year'}>,
<Axes: title={'center': 'month'}>],
[<Axes: title={'center': 'day'}>,
<Axes: title={'center': 'hour'}>, <Axes: >, <Axes: >]],
dtype=object)
Step 5: Rerun the model with the same settings as before, just with more features¶
predictor_new_features = TabularPredictor(
label="count", problem_type="regression", eval_metric="rmse"
).fit(
train_data=train.drop(['casual', 'registered'], axis=1),
time_limit=600,
presets='best_quality')
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_111603"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_111603"
AutoGluon Version: 0.8.2
Python Version: 3.10.14
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail: 3.05 GB / 5.36 GB (57.0%)
WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception.
We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows: 10886
Train Data Columns: 13
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context("mode.use_inf_as_na", True): # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 1952.55 MB
Train Data (Original) Memory Usage: 0.81 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 7 | ['holiday', 'workingday', 'humidity', 'year', 'month', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 4 | ['humidity', 'month', 'day', 'hour']
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 3 | ['datetime', 'datetime.year', 'datetime.dayofweek']
1.4s = Fit runtime
13 features in original data used to generate 15 features in processed data.
Train Data (Processed) Memory Usage: 0.8 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 1.47s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 398.92s of the 598.52s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.04s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 398.79s of the 598.4s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.66s of the 598.27s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000] valid_set's rmse: 35.722 [2000] valid_set's rmse: 34.0646 [3000] valid_set's rmse: 33.7501 [4000] valid_set's rmse: 33.5663 [5000] valid_set's rmse: 33.5927 [1000] valid_set's rmse: 36.6943 [2000] valid_set's rmse: 34.7009 [3000] valid_set's rmse: 34.2654 [4000] valid_set's rmse: 34.0805 [5000] valid_set's rmse: 34.0068 [6000] valid_set's rmse: 33.9926 [7000] valid_set's rmse: 34.0148 [8000] valid_set's rmse: 34.0505 [1000] valid_set's rmse: 37.0225 [2000] valid_set's rmse: 34.5264 [3000] valid_set's rmse: 33.9428 [4000] valid_set's rmse: 33.6752 [5000] valid_set's rmse: 33.5411 [6000] valid_set's rmse: 33.4628 [7000] valid_set's rmse: 33.3908 [8000] valid_set's rmse: 33.3862 [9000] valid_set's rmse: 33.3645 [10000] valid_set's rmse: 33.3686 [1000] valid_set's rmse: 38.1752 [2000] valid_set's rmse: 36.5188 [3000] valid_set's rmse: 36.1264 [4000] valid_set's rmse: 35.9954 [5000] valid_set's rmse: 35.9337 [6000] valid_set's rmse: 35.9463 [1000] valid_set's rmse: 38.9031 [2000] valid_set's rmse: 36.7896 [3000] valid_set's rmse: 36.3287 [4000] valid_set's rmse: 36.2175 [5000] valid_set's rmse: 36.1359 [6000] valid_set's rmse: 36.0948 [7000] valid_set's rmse: 36.174 [1000] valid_set's rmse: 35.8977 [2000] valid_set's rmse: 33.4992 [3000] valid_set's rmse: 32.7907 [4000] valid_set's rmse: 32.4471 [5000] valid_set's rmse: 32.2892 [6000] valid_set's rmse: 32.2846 [7000] valid_set's rmse: 32.2649 [8000] valid_set's rmse: 32.3084 [1000] valid_set's rmse: 38.3394 [2000] valid_set's rmse: 37.1199 [3000] valid_set's rmse: 36.8417 [4000] valid_set's rmse: 36.6798 [5000] valid_set's rmse: 36.6466 [6000] valid_set's rmse: 36.6288 [7000] valid_set's rmse: 36.6832 [1000] valid_set's rmse: 35.8969 [2000] valid_set's rmse: 34.1606 [3000] valid_set's rmse: 33.8527 [4000] valid_set's rmse: 33.714 [5000] valid_set's rmse: 33.6917
-34.4539 = Validation score (-root_mean_squared_error)
82.09s = Training runtime
14.9s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 290.28s of the 489.88s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000] valid_set's rmse: 33.1713 [2000] valid_set's rmse: 33.0077 [1000] valid_set's rmse: 32.8635 [2000] valid_set's rmse: 32.6404 [1000] valid_set's rmse: 31.9543 [2000] valid_set's rmse: 31.343 [3000] valid_set's rmse: 30.9039 [4000] valid_set's rmse: 30.8612 [1000] valid_set's rmse: 35.8483 [2000] valid_set's rmse: 35.4773 [3000] valid_set's rmse: 35.3993 [1000] valid_set's rmse: 35.5388 [1000] valid_set's rmse: 31.6283 [1000] valid_set's rmse: 37.9327 [2000] valid_set's rmse: 37.4577 [1000] valid_set's rmse: 34.9434 [2000] valid_set's rmse: 34.6719
-33.9173 = Validation score (-root_mean_squared_error)
30.95s = Training runtime
3.36s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 252.17s of the 451.77s of remaining time.
-38.425 = Validation score (-root_mean_squared_error)
20.96s = Training runtime
0.61s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 229.93s of the 429.54s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, early stopping on iteration 2051.
Ran out of time, early stopping on iteration 2220.
Ran out of time, early stopping on iteration 2283.
Ran out of time, early stopping on iteration 2382.
Ran out of time, early stopping on iteration 2447.
Ran out of time, early stopping on iteration 2516.
Ran out of time, early stopping on iteration 2871.
Ran out of time, early stopping on iteration 3160.
-34.342 = Validation score (-root_mean_squared_error)
220.46s = Training runtime
0.12s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 9.24s of the 208.84s of remaining time.
-38.1073 = Validation score (-root_mean_squared_error)
9.76s = Training runtime
0.8s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 197.54s of remaining time.
-32.2137 = Validation score (-root_mean_squared_error)
0.51s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 197.0s of the 196.99s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
[1000] valid_set's rmse: 30.5043 [1000] valid_set's rmse: 31.5792
-31.1511 = Validation score (-root_mean_squared_error)
19.62s = Training runtime
1.04s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 175.13s of the 175.11s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
-30.5897 = Validation score (-root_mean_squared_error)
14.36s = Training runtime
0.26s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 160.18s of the 160.17s of remaining time.
-31.6744 = Validation score (-root_mean_squared_error)
51.4s = Training runtime
0.82s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 107.29s of the 107.28s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, early stopping on iteration 866.
Ran out of time, early stopping on iteration 946.
Ran out of time, early stopping on iteration 931.
Ran out of time, early stopping on iteration 901.
Ran out of time, early stopping on iteration 1000.
Ran out of time, early stopping on iteration 1066.
Ran out of time, early stopping on iteration 1134.
Ran out of time, early stopping on iteration 1403.
-30.5154 = Validation score (-root_mean_squared_error)
102.8s = Training runtime
0.1s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 4.31s of the 4.3s of remaining time.
-31.4851 = Validation score (-root_mean_squared_error)
15.74s = Training runtime
0.74s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -12.76s of remaining time.
-30.2404 = Validation score (-root_mean_squared_error)
0.36s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 613.17s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_111603")
predictor_new_features.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.240410 22.095974 552.836447 0.001155 0.359319 3 True 14
1 CatBoost_BAG_L2 -30.515401 19.975629 467.097923 0.096180 102.799073 2 True 12
2 LightGBM_BAG_L2 -30.589689 20.138117 378.655426 0.258667 14.356575 2 True 10
3 LightGBMXT_BAG_L2 -31.151067 20.918398 383.922330 1.038949 19.623480 2 True 9
4 ExtraTreesMSE_BAG_L2 -31.485109 20.621979 380.036663 0.742529 15.737813 2 True 13
5 RandomForestMSE_BAG_L2 -31.674417 20.701023 415.698001 0.821574 51.399150 2 True 11
6 WeightedEnsemble_L2 -32.213694 19.040947 355.014108 0.000823 0.511402 2 True 8
7 LightGBM_BAG_L1 -33.917339 3.360040 30.952478 3.360040 30.952478 1 True 4
8 CatBoost_BAG_L1 -34.341995 0.121933 220.460180 0.121933 220.460180 1 True 6
9 LightGBMXT_BAG_L1 -34.453884 14.897718 82.090137 14.897718 82.090137 1 True 3
10 ExtraTreesMSE_BAG_L1 -38.107278 0.795041 9.756352 0.795041 9.756352 1 True 7
11 RandomForestMSE_BAG_L1 -38.424984 0.612650 20.955973 0.612650 20.955973 1 True 5
12 KNeighborsDist_BAG_L1 -84.125061 0.047784 0.043936 0.047784 0.043936 1 True 2
13 KNeighborsUnif_BAG_L1 -101.546199 0.044284 0.039792 0.044284 0.039792 1 True 1
Number of models trained: 14
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_CatBoost', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_KNN'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 2 | ['season', 'weather']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 4 | ['humidity', 'month', 'day', 'hour']
('int', ['bool']) : 3 | ['holiday', 'workingday', 'year']
('int', ['datetime_as_int']) : 3 | ['datetime', 'datetime.year', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -34.453884062670745,
'LightGBM_BAG_L1': -33.91733862651761,
'RandomForestMSE_BAG_L1': -38.424983594881716,
'CatBoost_BAG_L1': -34.34199492944324,
'ExtraTreesMSE_BAG_L1': -38.10727767243523,
'WeightedEnsemble_L2': -32.2136936832968,
'LightGBMXT_BAG_L2': -31.151066802192368,
'LightGBM_BAG_L2': -30.589688521755814,
'RandomForestMSE_BAG_L2': -31.674416659292678,
'CatBoost_BAG_L2': -30.5154005834992,
'ExtraTreesMSE_BAG_L2': -31.485108815338208,
'WeightedEnsemble_L3': -30.24040961851195},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': ['KNeighborsUnif_BAG_L1'],
'KNeighborsDist_BAG_L1': ['KNeighborsDist_BAG_L1'],
'LightGBMXT_BAG_L1': ['LightGBMXT_BAG_L1'],
'LightGBM_BAG_L1': ['LightGBM_BAG_L1'],
'RandomForestMSE_BAG_L1': ['RandomForestMSE_BAG_L1'],
'CatBoost_BAG_L1': ['CatBoost_BAG_L1'],
'ExtraTreesMSE_BAG_L1': ['ExtraTreesMSE_BAG_L1'],
'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
'LightGBMXT_BAG_L2': ['LightGBMXT_BAG_L2'],
'LightGBM_BAG_L2': ['LightGBM_BAG_L2'],
'RandomForestMSE_BAG_L2': ['RandomForestMSE_BAG_L2'],
'CatBoost_BAG_L2': ['CatBoost_BAG_L2'],
'ExtraTreesMSE_BAG_L2': ['ExtraTreesMSE_BAG_L2'],
'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.03979229927062988,
'KNeighborsDist_BAG_L1': 0.04393649101257324,
'LightGBMXT_BAG_L1': 82.09013748168945,
'LightGBM_BAG_L1': 30.952478408813477,
'RandomForestMSE_BAG_L1': 20.955973148345947,
'CatBoost_BAG_L1': 220.46018028259277,
'ExtraTreesMSE_BAG_L1': 9.756352186203003,
'WeightedEnsemble_L2': 0.511401891708374,
'LightGBMXT_BAG_L2': 19.623480081558228,
'LightGBM_BAG_L2': 14.35657525062561,
'RandomForestMSE_BAG_L2': 51.399150371551514,
'CatBoost_BAG_L2': 102.79907274246216,
'ExtraTreesMSE_BAG_L2': 15.737812995910645,
'WeightedEnsemble_L3': 0.35931873321533203},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.04428362846374512,
'KNeighborsDist_BAG_L1': 0.04778409004211426,
'LightGBMXT_BAG_L1': 14.897717952728271,
'LightGBM_BAG_L1': 3.360039710998535,
'RandomForestMSE_BAG_L1': 0.6126501560211182,
'CatBoost_BAG_L1': 0.1219325065612793,
'ExtraTreesMSE_BAG_L1': 0.7950413227081299,
'WeightedEnsemble_L2': 0.0008225440979003906,
'LightGBMXT_BAG_L2': 1.0389490127563477,
'LightGBM_BAG_L2': 0.2586674690246582,
'RandomForestMSE_BAG_L2': 0.8215737342834473,
'CatBoost_BAG_L2': 0.09617972373962402,
'ExtraTreesMSE_BAG_L2': 0.7425293922424316,
'WeightedEnsemble_L3': 0.0011551380157470703},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.240410 22.095974 552.836447
1 CatBoost_BAG_L2 -30.515401 19.975629 467.097923
2 LightGBM_BAG_L2 -30.589689 20.138117 378.655426
3 LightGBMXT_BAG_L2 -31.151067 20.918398 383.922330
4 ExtraTreesMSE_BAG_L2 -31.485109 20.621979 380.036663
5 RandomForestMSE_BAG_L2 -31.674417 20.701023 415.698001
6 WeightedEnsemble_L2 -32.213694 19.040947 355.014108
7 LightGBM_BAG_L1 -33.917339 3.360040 30.952478
8 CatBoost_BAG_L1 -34.341995 0.121933 220.460180
9 LightGBMXT_BAG_L1 -34.453884 14.897718 82.090137
10 ExtraTreesMSE_BAG_L1 -38.107278 0.795041 9.756352
11 RandomForestMSE_BAG_L1 -38.424984 0.612650 20.955973
12 KNeighborsDist_BAG_L1 -84.125061 0.047784 0.043936
13 KNeighborsUnif_BAG_L1 -101.546199 0.044284 0.039792
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001155 0.359319 3 True
1 0.096180 102.799073 2 True
2 0.258667 14.356575 2 True
3 1.038949 19.623480 2 True
4 0.742529 15.737813 2 True
5 0.821574 51.399150 2 True
6 0.000823 0.511402 2 True
7 3.360040 30.952478 1 True
8 0.121933 220.460180 1 True
9 14.897718 82.090137 1 True
10 0.795041 9.756352 1 True
11 0.612650 20.955973 1 True
12 0.047784 0.043936 1 True
13 0.044284 0.039792 1 True
fit_order
0 14
1 12
2 10
3 9
4 13
5 11
6 8
7 4
8 6
9 3
10 7
11 5
12 2
13 1 }
predictions_new_features = predictor_new_features.predict(test)
predictions_new_features = {'datetime': test['datetime'], 'Pred_count': predictions_new_features}
predictions_new_features = pd.DataFrame(data=predictions_new_features)
predictions_new_features.head()
| datetime | Pred_count | |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 15.958500 |
| 1 | 2011-01-20 01:00:00 | 11.255019 |
| 2 | 2011-01-20 02:00:00 | 10.674420 |
| 3 | 2011-01-20 03:00:00 | 9.340588 |
| 4 | 2011-01-20 04:00:00 | 7.769901 |
# Remember to set all negative values to zero (use .loc so only the Pred_count column is modified)
predictions_new_features.loc[predictions_new_features['Pred_count'] < 0, 'Pred_count'] = 0
predictions_new_features.describe()
| datetime | Pred_count | |
|---|---|---|
| count | 6493 | 6493.000000 |
| mean | 2012-01-13 09:27:47.765285632 | 154.844589 |
| min | 2011-01-20 00:00:00 | 1.780506 |
| 25% | 2011-07-22 15:00:00 | 53.331760 |
| 50% | 2012-01-20 23:00:00 | 119.993599 |
| 75% | 2012-07-20 17:00:00 | 220.551758 |
| max | 2012-12-31 23:00:00 | 816.064575 |
| std | NaN | 133.741531 |
# Submit the new predictions the same way as before
submission_new_features = pd.read_csv('submission.csv')
submission_new_features["count"] = predictions_new_features['Pred_count']
submission_new_features.to_csv("submission_new_features.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features.csv -m "new features"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 660kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description           status    publicScore  privateScore
---------------------------  -------------------  --------------------  --------  -----------  ------------
submission_new_features.csv  2024-04-29 11:42:11  new features          complete  0.68259      0.68259
submission.csv               2024-04-29 11:12:02  first raw submission  complete  1.7998       1.7998
New score of 0.68259¶
Step 6: Hyperparameter optimization¶
- There are many options for hyperparameter optimization.
- Options are to change the AutoGluon higher-level parameters or the individual model hyperparameters.
- Tuning the hyperparameters of the individual models requires the `hyperparameters` and `hyperparameter_tune_kwargs` arguments to `fit()`.
import autogluon.core as ag
from autogluon.common import space
from autogluon.tabular import TabularPredictor
nn_options = {
'dropout_prob': space.Real(0.0, 0.5, default=0.1), # dropout probability
}
gbm_options = {
'num_boost_round': 100, # number of boosting rounds
'num_leaves': space.Int(lower=26, upper=66, default=36), # number of leaves in trees
}
hyperparameters = { # hyperparameters of each model type
'GBM': gbm_options,
'NN_TORCH': nn_options,
}
num_trials = 3 # try at most 3 different hyperparameter configurations for each type of model
search_strategy = 'auto'  # tune hyperparameters with a Bayesian optimization routine using a local scheduler
hyperparameter_tune_kwargs = {
'num_trials': num_trials,
'scheduler' : 'local',
'searcher': search_strategy,
}
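Conceptually, the searcher draws trial configurations from the declared ranges and scores each one; a minimal random-search sketch using only the standard library (an illustration of the idea, not AutoGluon's internal implementation):

```python
import random

random.seed(0)

# Ranges mirroring the search spaces declared above
dropout_range = (0.0, 0.5)  # space.Real(0.0, 0.5)
leaves_range = (26, 66)     # space.Int(26, 66)

num_trials = 3
for trial in range(num_trials):
    config = {
        'dropout_prob': random.uniform(*dropout_range),
        'num_leaves': random.randint(*leaves_range),
    }
    # In real HPO each config would be fit and scored; here we only show the sampling
    print(trial, config)
```

A Bayesian searcher differs from this sketch by using the scores of earlier trials to bias where later configurations are drawn from.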
predictor_new_hpo = TabularPredictor(
label="count", problem_type="regression", eval_metric="rmse"
).fit(
train_data=train.drop(['casual', 'registered'], axis=1),
time_limit=600,
presets='best_quality', hyperparameters=hyperparameters, hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
No path specified. Models will be saved in: "AutogluonModels/ag-20240429_124008"
Presets specified: ['best_quality']
Warning: hyperparameter tuning is currently experimental and may cause the process to hang.
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
/opt/conda/lib/python3.10/site-packages/pkg_resources/__init__.py:2832: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
declare_namespace(pkg)
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20240429_124008"
AutoGluon Version: 0.8.2
Python Version: 3.10.14
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Sat Mar 23 09:49:55 UTC 2024
Disk Space Avail: 1.63 GB / 5.36 GB (30.4%)
WARNING: Available disk space is low and there is a risk that AutoGluon will run out of disk during fit, causing an exception.
We recommend a minimum available disk space of 10 GB, and large datasets may require more.
Train Data Rows: 10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
/opt/conda/lib/python3.10/site-packages/autogluon/tabular/learner/default_learner.py:215: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
with pd.option_context("mode.use_inf_as_na", True): # treat None, NaN, INF, NINF as NA
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 2228.55 MB
Train Data (Original) Memory Usage: 1.52 MB (0.1% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
('object', ['datetime_as_object']) : 1 | ['datetime']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.7s = Fit runtime
9 features in original data used to generate 13 features in processed data.
Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.8s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'GBM': {'num_boost_round': 100, 'num_leaves': Int: lower=26, upper=66},
'NN_TORCH': {'dropout_prob': Real: lower=0.0, upper=0.5},
}
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 2 L1 models ...
Hyperparameter tuning model: LightGBM_BAG_L1 ... Tuning model for up to 179.71s of the 599.15s of remaining time.
0%| | 0/3 [00:00<?, ?it/s]
Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
/opt/conda/lib/python3.10/site-packages/dask/dataframe/_pyarrow_compat.py:17: FutureWarning: Minimal version of pyarrow will soon be increased to 14.0.1. You are using 12.0.1. Please consider upgrading.
warnings.warn(
/opt/conda/lib/python3.10/site-packages/dask/dataframe/__init__.py:31: FutureWarning: Dask dataframe query planning is disabled because dask-expr is not installed. You can install it with `pip install dask[dataframe]` or `conda install dask`. This will raise in a future version.
warnings.warn(msg, FutureWarning)
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitted model: LightGBM_BAG_L1/T1 ...
-135.4732 = Validation score (-root_mean_squared_error)
6.0s = Training runtime
0.0s = Validation runtime
Fitted model: LightGBM_BAG_L1/T2 ...
-135.0295 = Validation score (-root_mean_squared_error)
5.3s = Training runtime
0.0s = Validation runtime
Fitted model: LightGBM_BAG_L1/T3 ...
-134.1941 = Validation score (-root_mean_squared_error)
4.53s = Training runtime
0.0s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L1 ... Tuning model for up to 179.71s of the 575.81s of remaining time.
Will use custom hpo logic because ray import failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.6.3`
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Ran out of time, stopping training early. (Stopping on epoch 35)
Ran out of time, stopping training early. (Stopping on epoch 40)
Ran out of time, stopping training early. (Stopping on epoch 47)
Ran out of time, stopping training early. (Stopping on epoch 51)
Ran out of time, stopping training early. (Stopping on epoch 56)
Ran out of time, stopping training early. (Stopping on epoch 69)
Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L1/T1 ...
-142.041 = Validation score (-root_mean_squared_error)
161.57s = Training runtime
0.0s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 414.12s of remaining time.
-134.003 = Validation score (-root_mean_squared_error)
1.05s = Training runtime
0.0s = Validation runtime
Fitting 2 L2 models ...
Hyperparameter tuning model: LightGBM_BAG_L2 ... Tuning model for up to 185.84s of the 412.93s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Fitted model: LightGBM_BAG_L2/T1 ...
-134.4063 = Validation score (-root_mean_squared_error)
5.23s = Training runtime
0.0s = Validation runtime
Fitted model: LightGBM_BAG_L2/T2 ...
-134.1484 = Validation score (-root_mean_squared_error)
3.48s = Training runtime
0.0s = Validation runtime
Fitted model: LightGBM_BAG_L2/T3 ...
-134.5661 = Validation score (-root_mean_squared_error)
5.08s = Training runtime
0.0s = Validation runtime
Hyperparameter tuning model: NeuralNetTorch_BAG_L2 ... Tuning model for up to 185.84s of the 398.52s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy
Stopping HPO to satisfy time limit...
Fitted model: NeuralNetTorch_BAG_L2/T1 ...
-137.7589 = Validation score (-root_mean_squared_error)
109.39s = Training runtime
0.0s = Validation runtime
Repeating k-fold bagging: 2/20
Fitting model: LightGBM_BAG_L2/T1 ... Training model for up to 289.07s of the 289.02s of remaining time.
Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
-134.001 = Validation score (-root_mean_squared_error)
9.29s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBM_BAG_L2/T2 ... Training model for up to 284.82s of the 284.77s of remaining time.
Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
-133.7499 = Validation score (-root_mean_squared_error)
6.75s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L2/T3 ... Training model for up to 281.31s of the 281.25s of remaining time.
Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
-134.2248 = Validation score (-root_mean_squared_error)
8.88s = Training runtime
0.08s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L2/T1 ... Training model for up to 277.27s of the 277.23s of remaining time.
Fitting 8 child models (S2F1 - S2F8) | Fitting with SequentialLocalFoldFittingStrategy
-137.5566 = Validation score (-root_mean_squared_error)
225.87s = Training runtime
0.21s = Validation runtime
Completed 2/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the 160.47s of remaining time.
-133.5982 = Validation score (-root_mean_squared_error)
0.68s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 440.32s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240429_124008")
predictor_new_hpo.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -133.598151 0.378901 419.581011 0.001148 0.681137 3 True 10
1 LightGBM_BAG_L2/T2 -133.749929 0.087237 184.153805 0.086385 6.749786 2 True 7
2 LightGBM_BAG_L2/T1 -134.001034 0.070539 186.694348 0.069688 9.290329 2 True 6
3 WeightedEnsemble_L2 -134.002986 0.003735 172.454185 0.003287 1.053466 2 True 5
4 LightGBM_BAG_L1/T3 -134.194130 0.000139 4.532218 0.000139 4.532218 1 True 3
5 LightGBM_BAG_L2/T3 -134.224763 0.077487 186.282038 0.076636 8.878019 2 True 8
6 LightGBM_BAG_L1/T2 -135.029528 0.000109 5.302512 0.000109 5.302512 1 True 2
7 LightGBM_BAG_L1/T1 -135.473207 0.000403 6.003300 0.000403 6.003300 1 True 1
8 NeuralNetTorch_BAG_L2/T1 -137.556585 0.214733 403.272069 0.213881 225.868050 2 True 9
9 NeuralNetTorch_BAG_L1/T1 -142.041032 0.000201 161.565988 0.000201 161.565988 1 True 4
Number of models trained: 10
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_TabularNeuralNetTorch', 'StackerEnsembleModel_LGB'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
*** End of fit() summary ***
/opt/conda/lib/python3.10/site-packages/autogluon/core/utils/plots.py:169: UserWarning: AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"
warnings.warn('AutoGluon summary plots cannot be created because bokeh is not installed. To see plots, please do: "pip install bokeh==2.0.1"')
{'model_types': {'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T3': 'StackerEnsembleModel_LGB',
'NeuralNetTorch_BAG_L1/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2/T3': 'StackerEnsembleModel_LGB',
'NeuralNetTorch_BAG_L2/T1': 'StackerEnsembleModel_TabularNeuralNetTorch',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'LightGBM_BAG_L1/T1': -135.4732072756916,
'LightGBM_BAG_L1/T2': -135.02952795945737,
'LightGBM_BAG_L1/T3': -134.19413006667938,
'NeuralNetTorch_BAG_L1/T1': -142.04103202606262,
'WeightedEnsemble_L2': -134.00298642499382,
'LightGBM_BAG_L2/T1': -134.00103350983295,
'LightGBM_BAG_L2/T2': -133.74992855442642,
'LightGBM_BAG_L2/T3': -134.22476301488,
'NeuralNetTorch_BAG_L2/T1': -137.55658482834824,
'WeightedEnsemble_L3': -133.59815093816746},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'LightGBM_BAG_L1/T1': ['LightGBM_BAG_L1', 'T1'],
'LightGBM_BAG_L1/T2': ['LightGBM_BAG_L1', 'T2'],
'LightGBM_BAG_L1/T3': ['LightGBM_BAG_L1', 'T3'],
'NeuralNetTorch_BAG_L1/T1': ['NeuralNetTorch_BAG_L1', 'T1'],
'WeightedEnsemble_L2': ['WeightedEnsemble_L2'],
'LightGBM_BAG_L2/T1': ['LightGBM_BAG_L2', 'T1'],
'LightGBM_BAG_L2/T2': ['LightGBM_BAG_L2', 'T2'],
'LightGBM_BAG_L2/T3': ['LightGBM_BAG_L2', 'T3'],
'NeuralNetTorch_BAG_L2/T1': ['NeuralNetTorch_BAG_L2', 'T1'],
'WeightedEnsemble_L3': ['WeightedEnsemble_L3']},
'model_fit_times': {'LightGBM_BAG_L1/T1': 6.003300428390503,
'LightGBM_BAG_L1/T2': 5.3025124073028564,
'LightGBM_BAG_L1/T3': 4.5322184562683105,
'NeuralNetTorch_BAG_L1/T1': 161.56598782539368,
'WeightedEnsemble_L2': 1.0534660816192627,
'LightGBM_BAG_L2/T1': 9.290328741073608,
'LightGBM_BAG_L2/T2': 6.749785900115967,
'LightGBM_BAG_L2/T3': 8.878019332885742,
'NeuralNetTorch_BAG_L2/T1': 225.8680498600006,
'WeightedEnsemble_L3': 0.6811370849609375},
'model_pred_times': {'LightGBM_BAG_L1/T1': 0.00040268898010253906,
'LightGBM_BAG_L1/T2': 0.00010943412780761719,
'LightGBM_BAG_L1/T3': 0.00013875961303710938,
'NeuralNetTorch_BAG_L1/T1': 0.00020051002502441406,
'WeightedEnsemble_L2': 0.003286600112915039,
'LightGBM_BAG_L2/T1': 0.06968808174133301,
'LightGBM_BAG_L2/T2': 0.08638525009155273,
'LightGBM_BAG_L2/T3': 0.07663559913635254,
'NeuralNetTorch_BAG_L2/T1': 0.213881254196167,
'WeightedEnsemble_L3': 0.0011477470397949219},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'LightGBM_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'NeuralNetTorch_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -133.598151 0.378901 419.581011
1 LightGBM_BAG_L2/T2 -133.749929 0.087237 184.153805
2 LightGBM_BAG_L2/T1 -134.001034 0.070539 186.694348
3 WeightedEnsemble_L2 -134.002986 0.003735 172.454185
4 LightGBM_BAG_L1/T3 -134.194130 0.000139 4.532218
5 LightGBM_BAG_L2/T3 -134.224763 0.077487 186.282038
6 LightGBM_BAG_L1/T2 -135.029528 0.000109 5.302512
7 LightGBM_BAG_L1/T1 -135.473207 0.000403 6.003300
8 NeuralNetTorch_BAG_L2/T1 -137.556585 0.214733 403.272069
9 NeuralNetTorch_BAG_L1/T1 -142.041032 0.000201 161.565988
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001148 0.681137 3 True
1 0.086385 6.749786 2 True
2 0.069688 9.290329 2 True
3 0.003287 1.053466 2 True
4 0.000139 4.532218 1 True
5 0.076636 8.878019 2 True
6 0.000109 5.302512 1 True
7 0.000403 6.003300 1 True
8 0.213881 225.868050 2 True
9 0.000201 161.565988 1 True
fit_order
0 10
1 7
2 6
3 5
4 3
5 8
6 2
7 1
8 9
9 4 }
# Predict on the test set with the tuned predictor and collect the results in a DataFrame
prediction_new_hpo = predictor_new_hpo.predict(test)
prediction_new_hpo = {'datetime': test['datetime'], 'Pred_count': prediction_new_hpo}
prediction_new_hpo = pd.DataFrame(data=prediction_new_hpo)
prediction_new_hpo.head()
/opt/conda/lib/python3.10/site-packages/autogluon/features/generators/fillna.py:58: FutureWarning: The 'downcast' keyword in fillna is deprecated and will be removed in a future version. Use res.infer_objects(copy=False) to infer non-object dtype, or pd.to_numeric with the 'downcast' keyword to downcast numeric results.
  X.fillna(self._fillna_feature_map, inplace=True, downcast=False)
| datetime | Pred_count | |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 72.947769 |
| 1 | 2011-01-20 01:00:00 | 50.996433 |
| 2 | 2011-01-20 02:00:00 | 50.996471 |
| 3 | 2011-01-20 03:00:00 | 66.710915 |
| 4 | 2011-01-20 04:00:00 | 66.710983 |
# Remember to set all negative prediction values to zero.
# Note: masking only the 'Pred_count' column with .loc avoids zeroing out
# the whole row (the original prediction_new_hpo[mask] = 0 would also
# overwrite the datetime column).
prediction_new_hpo.loc[prediction_new_hpo['Pred_count'] < 0, 'Pred_count'] = 0
# Same procedure for submitting predictions
submission_new_hpo = pd.read_csv('submission.csv')
submission_new_hpo["count"] = prediction_new_hpo['Pred_count']
submission_new_hpo.to_csv("submission_new_hpo.csv", index=False)
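The negative-value floor applied above can also be expressed with pandas' built-in `clip`; a minimal sketch with toy values (the real notebook would call this on `prediction_new_hpo['Pred_count']`):

```python
import pandas as pd

# Toy predictions standing in for Pred_count; one value is negative.
preds = pd.Series([72.9, -3.2, 51.0], name="Pred_count")

# clip(lower=0) floors every negative value at zero without touching other columns.
clipped = preds.clip(lower=0)
print(clipped.tolist())
```

`clip` returns a new Series, so it composes cleanly with the submission-building step.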
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo.csv -m "new features with hyperparameters"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 576kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                     date                 description                        status    publicScore  privateScore
---------------------------  -------------------  ---------------------------------  --------  -----------  ------------
submission_new_hpo.csv       2024-04-29 12:53:53  new features with hyperparameters  complete  1.29634      1.29634
submission_new_features.csv  2024-04-29 11:42:11  new features                       complete  0.68259      0.68259
submission.csv               2024-04-29 11:12:02  first raw submission               complete  1.7998       1.7998
New Score of 1.29634¶
# Take the top model validation score from each training run and create a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. google sheets, excel)
fig = pd.DataFrame(
{
"model": ["initial", "add_features", "hpo"],
"score": [?, ?, ?]
}
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_train_score.png')
# Take the 3 Kaggle scores and create a line plot to show improvement
fig = pd.DataFrame(
{
"test_eval": ["initial", "add_features", "hpo"],
"score": [1.7998, 0.68259, 1.29634]
}
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_test_score.png')
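As a quick sanity check on the Kaggle scores (lower RMSLE is better): the feature-engineering run improved on the initial submission far more than the HPO run did. A small sketch computing the relative improvement of each run over the first raw submission:

```python
# Kaggle public scores from the three submissions (lower is better).
scores = {"initial": 1.7998, "add_features": 0.68259, "hpo": 1.29634}

# Relative improvement of each later run over the initial raw submission.
for name in ("add_features", "hpo"):
    improvement = (scores["initial"] - scores[name]) / scores["initial"]
    print(f"{name}: {improvement:.1%} better than initial")
```

This makes the ordering in the plot explicit: adding features cut the error by roughly 62%, while the HPO run only recovered about 28% relative to the raw baseline.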
Hyperparameter table¶
# The 3 hyperparameters we tuned, with the Kaggle score as the result
hyperparams_df = pd.DataFrame({
"model": ["initial_model", "add_features_model", "hpo_model"],
"hpo1": ['default_vals', 'default_vals', 'GBM: num_leaves: lower=26, upper=66'],
"hpo2": ['default_vals', 'default_vals', 'NN_TORCH: dropout_prob: 0.0, 0.5'],
"hpo3": ['default_vals', 'default_vals', 'GBM: num_boost_round: 100'],
"score": [1.7998, 0.68259, 1.29634]
})
hyperparams_df.head()
| model | hpo1 | hpo2 | hpo3 | score | |
|---|---|---|---|---|---|
| 0 | initial_model | default_vals | default_vals | default_vals | 1.79980 |
| 1 | add_features_model | default_vals | default_vals | default_vals | 0.68259 |
| 2 | hpo_model | GBM: num_leaves: lower=26, upper=66 | NN_TORCH: dropout_prob: 0.0, 0.5 | GBM: num_boost_round: 100 | 1.29634 |
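The tuned ranges in the table could be recorded as a plain dictionary for reference; this is a hypothetical sketch of that structure, not the exact `hyperparameters` argument passed to `fit()` (AutoGluon would wrap the tuned ranges in its search-space objects, e.g. `Int`/`Real`):

```python
# Hypothetical plain-dict record of the search space from the table above.
# Keys mirror AutoGluon model names (GBM = LightGBM, NN_TORCH = NeuralNetTorch);
# dict-valued entries were tuned ranges, scalar entries were fixed.
search_space = {
    "GBM": {
        "num_leaves": {"lower": 26, "upper": 66},  # tuned range
        "num_boost_round": 100,                    # fixed value
    },
    "NN_TORCH": {
        "dropout_prob": {"lower": 0.0, "upper": 0.5},  # tuned range
    },
}
print(sorted(search_space))
```

Keeping the space in one structure like this makes the hyperparameter table above reproducible from code rather than hand-copied.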
import matplotlib.pyplot as plt

# Helper to plot a time series slice; `fmt` avoids shadowing the built-in `format`.
def plot_series(time, series, fmt="-", start=0, end=None, label=None):
    plt.plot(time[start:end], series[start:end], fmt, label=label)
    plt.xlabel("Time")
    plt.ylabel("Value")
    if label:
        plt.legend(fontsize=14)
    plt.grid(True)
sub_new = pd.read_csv('submission_new_features.csv')
import matplotlib.pyplot as plt
series = train["count"].to_numpy()
time = train["datetime"].to_numpy()
plt.figure(figsize=(350, 15))
plot_series(time, series)
plt.title("Train Data time series graph")
#plot_series(time1, series1)
plt.show()
sub_new.loc[:, "datetime"] = pd.to_datetime(sub_new.loc[:, "datetime"])
series1 = sub_new["count"].to_numpy()
time1 = sub_new["datetime"].to_numpy()
plt.figure(figsize=(350, 15))
#plot_series(time, series)
plot_series(time1, series1)
plt.title("Test Data time series graph")
plt.show()
!jupyter nbconvert --to html project_notebook.ipynb
[NbConvertApp] WARNING | pattern 'project_notebook.ipynb' matched no files
cd ~/documents
[Errno 2] No such file or directory: '/home/sagemaker-user/documents'
/home/sagemaker-user/cd0385-project-starter